Integrating Social Media Data for Community Detection

نویسندگان

  • Jiliang Tang
  • Xufei Wang
  • Huan Liu
چکیده

Community detection is an unsupervised learning task that discovers groups such that group members share more similarities or interact more frequently among themselves than with people outside groups. In social media, link information can reveal heterogeneous relationships of various strengths, but often can be noisy. Since different sources of data in social media can provide complementary information, e.g., bookmarking and tagging data indicates user interests, frequency of commenting suggests the strength of ties, etc., we propose to integrate social media data of multiple types for improving the performance of community detection. We present a joint optimization framework to integrate multiple data sources for community detection. Empirical evaluation on both synthetic data and real-world social media data shows significant performance improvement of the proposed approach. This work elaborates the need for and challenges of multi-source integration of heterogeneous data types, and provides a principled way of multi-source community detection.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Social Media Data Integration for Community Detection

Community detection is an unsupervised learning task that discovers groups such that group members share more similarities or interact more frequently among themselves than with people outside groups. In social media, link information can reveal heterogeneous relationships of various strengths, but often can be noisy. Since different sources of data in social media can provide complementary inf...

متن کامل

Size Matters: A Comparative Analysis of Community Detection Algorithms

Understanding community structure of social media is critical due to its broad applications such as friend recommendations, user modeling and content personalizations. Existing research uses structural metrics such as modularity and conductance and functional metrics such as ground truth to measure the qualify of the communities discovered by various community detection algorithms, while overlo...

متن کامل

Nonnegative Matrix Tri-Factorization with Graph Regularization for Community Detection in Social Networks

Community detection on social media is a classic and challenging task. In this paper, we study the problem of detecting communities by combining social relations and user generated content in social networks. We propose a nonnegative matrix tri-factorization (NMTF) based clustering framework with three types of graph regularization. The NMTF based clustering framework can combine the relations ...

متن کامل

Mutually Enhancing Community Detection and Sentiment Analysis on Twitter Networks

The burgeoning use of Web 2.0-powered social media in recent years has inspired numerous studies on the content and composition of online social networks (OSNs). Many methods of harvesting useful information from social networks’ immense amounts of user-generated data have been successfully applied to such real-world topics as politics and marketing, to name just a few. This study presents a no...

متن کامل

Using Machine Learning Algorithms for Automatic Cyber Bullying Detection in Arabic Social Media

Social media allows people interact to express their thoughts or feelings about different subjects. However, some of users may write offensive twits to other via social media which known as cyber bullying. Successful prevention depends on automatically detecting malicious messages. Automatic detection of bullying in the text of social media by analyzing the text "twits" via one of the machine l...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2011